Using the OntoGene pipeline for the triage task of BioCreative 2012

Authors

  • Fabio Rinaldi
  • Simon Clematide
  • Simon Hafner
  • Gerold Schneider
  • Gintare Grigonyte
  • Martin Romacker
  • Thérèse Vachon
Abstract

In this article, we describe the architecture of the OntoGene Relation mining pipeline and its application in the triage task of BioCreative 2012. The aim of the task is to support the triage of abstracts relevant to the process of curation of the Comparative Toxicogenomics Database. We use a conventional information retrieval system (Lucene) to provide a baseline ranking, which we then combine with information provided by our relation mining system, in order to achieve an optimized ranking. Our approach additionally delivers domain entities mentioned in each input document as well as candidate relationships, both ranked according to a confidence score computed by the system. This information is presented to the user through an advanced interface aimed at supporting the process of interactive curation. Thanks, in particular, to the high-quality entity recognition, the OntoGene system achieved the best overall results in the task.
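The abstract does not say how the baseline ranking and the relation mining output are combined. As a purely illustrative sketch, one simple option is a linear interpolation of a normalized retrieval score and a normalized sum of entity confidence scores. The names `Doc`, `min_max`, `rerank`, and the `weight` parameter are hypothetical, and the combination rule is an assumption, not the method used by the OntoGene system.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    baseline_score: float            # assumed: score from an IR engine such as Lucene
    entity_confidences: list[float]  # assumed: one confidence per recognized entity


def min_max(values):
    """Scale values to [0, 1]; return zeros if all values are equal."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rerank(docs, weight=0.5):
    """Rank documents by a weighted mix of baseline score and entity evidence.

    weight=0.0 reproduces the baseline order; weight=1.0 uses entity evidence only.
    """
    base = min_max([d.baseline_score for d in docs])
    evidence = min_max([sum(d.entity_confidences) for d in docs])
    combined = [(1 - weight) * b + weight * e for b, e in zip(base, evidence)]
    ranked = sorted(zip(docs, combined), key=lambda pair: pair[1], reverse=True)
    return [(d.doc_id, round(score, 3)) for d, score in ranked]


if __name__ == "__main__":
    docs = [
        Doc("a", 2.0, [0.1]),
        Doc("b", 1.0, [0.9, 0.8]),
        Doc("c", 0.0, []),
    ]
    for doc_id, score in rerank(docs, weight=0.5):
        print(doc_id, score)
```

In this toy input, document `b` has a lower baseline score than `a` but stronger entity evidence, so it moves to the top once the two signals are mixed. In practice the weight would have to be tuned on held-out curated data.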

Similar resources

Ranking of CTD articles and interactions using the OntoGene pipeline

In this paper we briefly describe the architecture of the OntoGene Relation mining pipeline and its application in the task 1 of BioCreative IV. The aim of the task is to deliver information useful for the triage of abstracts relevant to the process of curation of the Comparative Toxicogenomics Database. Although the main focus of our text mining research is the extraction of interactions, we d...

Ontogene Term and Relation Recognition for CDR

For our participation in the CDR task of BioCreative 5, we have adapted the Ontogene System and optimized it for disease recognition (DNER Task) and identification of chemical-disease relationships (CID Task). For the DNER Task we have experimented with different changes to the term matching system. We describe the effects of an abbreviation detection tool as well as a selection of rules for te...

Evaluation of the CellFinder pipeline in the BioCreative IV User Interactive task

We present results on the participation of the CellFinder text mining pipeline for curation of gene/protein expression in anatomical parts in the BioCreative IV User Interactive task. The pipeline integrates state-of-the-art and freely available tools for the following steps: triage of potentially relevant documents, retrieval of documents, preprocessing, named-entity recognition, event extract...

CoIN: a network analysis for document triage

In recent years, there was a rapid increase in the number of medical articles. The number of articles in PubMed has increased exponentially. Thus, the workload for biocurators has also increased exponentially. Under these circumstances, a system that can automatically determine in advance which article has a higher priority for curation can effectively reduce the workload of biocurators. Determ...

Using binary classification to prioritize and curate articles for the Comparative Toxicogenomics Database

We report on the original integration of an automatic text categorization pipeline, so-called ToxiCat (Toxicogenomic Categorizer), that we developed to perform biomedical documents classification and prioritization in order to speed up the curation of the Comparative Toxicogenomics Database (CTD). The task can be basically described as a binary classification task, where a scoring function is u...

Journal title:

Volume 2013, Issue

Pages -

Publication date 2013